Wikipedia Chemical Structure Explorer: substructure and similarity searching of molecules from Wikipedia

نویسندگان

  • Peter Ertl
  • Luc Patiny
  • Thomas Sander
  • Christian Rufener
  • Michaël Zasso
چکیده

BACKGROUND Wikipedia, the world's largest and most popular encyclopedia is an indispensable source of chemistry information. It contains among others also entries for over 15,000 chemicals including metabolites, drugs, agrochemicals and industrial chemicals. To provide an easy access to this wealth of information we decided to develop a substructure and similarity search tool for chemical structures referenced in Wikipedia. RESULTS We extracted chemical structures from entries in Wikipedia and implemented a web system allowing structure and similarity searching on these data. The whole search as well as visualization system is written in JavaScript and therefore can run locally within a web page and does not require a central server. The Wikipedia Chemical Structure Explorer is accessible on-line at www.cheminfo.org/wikipedia and is available also as an open source project from GitHub for local installation. CONCLUSIONS The web-based Wikipedia Chemical Structure Explorer provides a useful resource for research as well as for chemical education enabling both researchers and students easy and user friendly chemistry searching and identification of relevant information in Wikipedia. The tool can also help to improve quality of chemical entries in Wikipedia by providing potential contributors regularly updated list of entries with problematic structures. And last but not least this search system is a nice example of how the modern web technology can be applied in the field of cheminformatics. Graphical abstractWikipedia Chemical Structure Explorer allows substructure and similarity searches on molecules referenced in Wikipedia.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Chemical Similarity Searching

This paper reviews the use of similarity searching in chemical databases. It begins by introducing the concept of similarity searching, differentiating it from the more common substructure searching, and then discusses the current generation of fragment-based measures that are used for searching chemical structure databases. The next sections focus upon two of the principal characteristics of a...

متن کامل

Advertising Keyword Suggestion Using Relevance-Based Language Models from Wikipedia Rich Articles

When emerging technologies such as Search Engine Marketing (SEM) face tasks that require human level intelligence, it is inevitable to use the knowledge repositories to endow the machine with the breadth of knowledge available to humans. Keyword suggestion for search engine advertising is an important problem for sponsored search and SEM that requires a goldmine repository of knowledge. A recen...

متن کامل

Distributional Semantics for Entity Relatedness

Wikipedia provides an enormous amount of background knowledge to reason about the semantic relatedness between two entities. In this work, we present a distributional semantics based approach for computing entity relatedness, and a focused related entities explorer based on this approach.

متن کامل

Context-Sensitive Ranking Using Cross-Domain Knowledge for Chemical Digital Libraries

Today, entity-centric searches are common tasks for information gathering. But, due to the huge amount of available information the entity itself is often not sufficient for finding suitable results. Users are usually searching for entities in a specific search context which is important for their relevance assessment. Therefore, for digital library providers it is inevitable to also consider t...

متن کامل

Maximum Common Substructure-Based Data Fusion in Similarity Searching

Data fusion has been shown to work very well when applied to fingerprint-based similarity searching, yet little is known of its application to maximum common substructure (MCS)-based similarity searching. Two similarity search applications of the MCS will be focused on here. Typically, the number of bonds in the MCS, as well as the bonds in the two molecules being compared, are used in a simila...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره 7  شماره 

صفحات  -

تاریخ انتشار 2015